The TALP n-gram-based SMT system for IWSLT 2006

نویسندگان

  • Josep Maria Crego
  • Adrià de Gispert
  • Patrik Lambert
  • Maxim Khalilov
  • Marta R. Costa-Jussà
  • José B. Mariño
  • Rafael E. Banchs
  • José A. R. Fonollosa
چکیده

This paper describes TALPtuples, the 2006 Ngrambased statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years, being highlighted and empirically compared. Mainly, these include a novel and much more efficient word ordering strategy based on reordering patterns, a linguistically-guided tuple segmentation criterion and improved optimization procedures. The paper provides details of this system participation in the third International Workshop on Spoken Language Translation (IWSLT) held in Kyoto, Japan in November 2006. Results on four translation directions are reported, namely from Arabic, Chinese, Italian and Japanese into English for the open data track, thoroughly explaining all language-related preprocessing and optimization schemes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The TALP ngram-based SMT system for IWSLT'05

This paper provides a description of TALP-Ngram, the tuple-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya). Briefly, the system performs a log-linear combination of a translation model and additional feature functions. The translation model is estimated as an N-gram of bilingual units called tuples, and the fea...

متن کامل

The TALP n-gram-based SMT system for IWSLT 2007

This paper describes TALPtuples, the 2007 N -gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters in function of translation metric scores and rescoring with a neur...

متن کامل

TALP phrase-based system and TALP system combination for IWSLT 2006

This paper describes the TALP phrase-based statistical machine translation system, enriched with the statistical machine reordering technique. We also report the combination of this system and the TALP-tuple, the n-gram-based statistical machine translation system. We report the results for all the tasks (Chinese, Arabic, Italian and Japanese to English) in the framework of the third evaluation...

متن کامل

The TALP&I2r SMT systems for IWSLT 2008

This paper gives a description of the statistical machine translation (SMT) systems developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) for our participation in the IWSLT’08 evaluation campaign. We present Ngram-based (TALPtuples) and phrase-based (TALPphrases) SMT systems. The paper explains the 2008 systems’ architecture and outlines translation schemes we ...

متن کامل

N-gram-based machine translation enhanced with neural networks for the French-English BTEC-IWSLT'10 task

Neural Network Language Models (NNLMs) have been applied to Statistical Machine Translation (SMT) outperforming the translation quality. N -best list rescoring is the most popular approach to deal with the computational problems that appear when using huge NNLMs. But the question of “how much improvement could be achieved in a coupled system” remains unanswered. This open question motivated som...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006